kgo: do not add all topics to internal tps map when regex consuming #838

Merged
twmb merged 1 commit into master from 810 on Oct 15, 2024

Conversation

twmb
Owner

@twmb twmb commented Oct 14, 2024

The internal tps map is meant to store the topicPartitions that are candidates to be consumed. assignPartitions then filters these, opting in only the partitions that are actually being consumed.

It's not BAD if we store all topics in that map, but it's not the intent. The rest of the client worked fine even with extra topics in the map.

When regex consuming, the metadata function previously always put all topics into the map. Now, the regex evaluation logic, which was duplicated in both the direct and group consumers, lives in one function, and metadata uses that function for filtering.

This introduces a required sequence of filtering THEN finding assignments, which is fine / was the way things operated anyway.

Moving the filtering to metadata (only in the regex consuming logic) means that we no longer store information for topics we are not consuming. Indirectly, this fixes a bug where `GetConsumeTopics` would always return ALL topics when regex consuming, because `GetConsumeTopics` always just returned what was in the `tps` field.

This adds a test for the fixed behavior, as well as tests that consuming WITHOUT regex always returns all topics the user is interested in.

Closes #810.

@twmb twmb force-pushed the 810 branch 2 times, most recently from d518d4a to 454a611 Compare October 14, 2024 18:47
@twmb twmb added the patch label Oct 14, 2024
@twmb twmb merged commit d771ddf into master Oct 15, 2024
8 checks passed
@twmb twmb deleted the 810 branch October 15, 2024 00:38
ortuman pushed a commit to grafana/franz-go that referenced this pull request Oct 17, 2024
kgo: do not add all topics to internal tps map when regex consuming
ortuman added a commit to grafana/franz-go that referenced this pull request Oct 17, 2024
* Fix typo in kgo.Client.ResumeFetchTopics() docs

Signed-off-by: Mihai Todor <[email protected]>

* add `NewOffsetFromRecord` helper function

* Fix typo in Record.ProducerID doc comment.

* Don't set nil config when seeding topics in kfake cluster

Setting the configs to `nil` causes it to panic later when trying to alter the topic configs, as it only checks for entry in the map not being present, not for it being nil

Signed-off-by: Oleg Zaytsev <[email protected]>

* Add Opts method for sr.Client

* Merge pull request twmb#826 from colega/don-t-set-nil-config-when-seeding-topics-in-kfake-cluster

Don't set nil config when seeding topics in kfake cluster

* Merge pull request twmb#821 from seizethedave/davidgrant/producer-doc

Fix typo in Record.ProducerID doc comment.

* Merge pull request twmb#812 from mihaitodor/fix-doc-typo

Fix typo in kgo.Client.ResumeFetchTopics() docs

* kgo: fix potential deadlock when reaching max buffered (records|bytes)

Problem:
* Record A exceeds max, is on path to block
* Record B finishes concurrently
* Record A's context cancels
* Record A's goroutine waiting to be unblocked returns, leaves accounting mutex in locked state
* Record A's select statement chooses context-canceled case, trying to grab the accounting mutex lock

See twmb#831 for more details.

Closes twmb#831.

* all: unlint what is now cropping up

gosec ones are deliberate; govet ones are now randomly showing up (and
also deliberate)

* Merge pull request twmb#832 from twmb/831

kgo: fix potential deadlock when reaching max buffered (records|bytes)

* kgo: misc doc update

* kgo: ignore OOOSN where possible

See embedded comment. This precedes handling KIP-890.

Closes twmb#805.

* kip-890 definitions

A bunch of version bumps to indicate TransactionAbortable is supported
as an error return.

* kip-848 more definitions

Added in Kafka 3.8:
* ListGroups.TypesFilter
* ConsumerGroupDescribe request

* kip-994 proto

Only ListTransactions was modified in 3.8

* sr: add StatusCode to ResponseError, and message if the body is empty

Closes twmb#819.

* generate / kmsg: update GroupMetadata{Key,Value}

Not much changed here.

Closes twmb#804.

* kgo: do not add all topics to internal tps map when regex consuming

* Merge pull request twmb#833 from twmb/proto-3.8.0

Proto 3.8.0

* kgo: support Kafka 3.8's kip-890 modifications

STILL NOT ALL OF KIP-890, despite what I originally coded.
Kafka 3.8 only added support for TransactionAbortable.
Producers still need to send AddPartitionsToTxn.

* kversion: note kip-848 additions for kafka 3.8

* kversion: note kip-994 added in 3.8, finalize 3.8

* kversion: ignore API keys 74,75 when guessing versions

These are in Kraft only, and are two requests from two separate KIPs
that aren't fully supported yet. Not sure why only these two were
stabilized.

* README: note 3.8 KIPs

* kgo: bump kmsg pinned dep

* Merge pull request twmb#840 from twmb/kafka-3.8.0

Kafka 3.8.0

* Merge pull request twmb#760 from twmb/753

kgo: add AllowRebalance and CloseAllowingRebalance to GroupTransactSession

* Merge pull request twmb#789 from sbuliarca/errgroupsession-export-err

kgo: export the wrapped error from ErrGroupSession

* Merge pull request twmb#794 from twmb/790

kgo: add TopicID to the FetchTopic type

* Merge pull request twmb#814 from noamcohen97/new-offset-helper

kadm: add `NewOffsetFromRecord` helper function

* Merge pull request twmb#829 from andrewstucki/sr-client-opts

Add Opts method for sr.Client

* Merge pull request twmb#834 from twmb/805

kgo: ignore OOOSN where possible

* Merge pull request twmb#835 from twmb/819

sr: add StatusCode to ResponseError, and message if the body is empty

* Merge pull request twmb#838 from twmb/810

kgo: do not add all topics to internal tps map when regex consuming

* CHANGELOG: note incoming release

* Merge pull request twmb#841 from twmb/1.18-changelog

CHANGELOG: note incoming release

* pkg/sr: require go 1.22

No real reason, no real reason not to. This also allows one commit after
the top level franz tag.

* Merge pull request twmb#842 from twmb/sr-1.22

pkg/sr: require go 1.22

* pkg/kadm: bump go deps

* Merge pull request twmb#843 from twmb/kadm

pkg/kadm: bump go deps

---------

Signed-off-by: Mihai Todor <[email protected]>
Signed-off-by: Oleg Zaytsev <[email protected]>
Co-authored-by: Mihai Todor <[email protected]>
Co-authored-by: Noam Cohen <[email protected]>
Co-authored-by: David Grant <[email protected]>
Co-authored-by: Oleg Zaytsev <[email protected]>
Co-authored-by: Andrew Stucki <[email protected]>
Co-authored-by: Travis Bischel <[email protected]>
mihaitodor added a commit to redpanda-data/connect that referenced this pull request Dec 12, 2024
This is required in order to pull in twmb/franz-go#838

This is needed because the `redpanda_migrator` input needs to
create all the matched topics during the first call to
`ReadBatch()`.

Signed-off-by: Mihai Todor <[email protected]>
mihaitodor added a commit to redpanda-data/connect that referenced this pull request Dec 16, 2024
mihaitodor added a commit to redpanda-data/connect that referenced this pull request Dec 16, 2024
mihaitodor added a commit to redpanda-data/connect that referenced this pull request Dec 16, 2024
mihaitodor added a commit to redpanda-data/connect that referenced this pull request Dec 31, 2024
mihaitodor added a commit to redpanda-data/connect that referenced this pull request Jan 3, 2025
Successfully merging this pull request may close these issues.

GetConsumeTopics returns all topics when consuming via regex, not just the topics that are being consumed